LINGUISTIC DESCRIPTION IN DICTIONARIES: SEMANTICS

Extraction of semantic relations from a Basque monolingual dictionary using Constraint Grammar

Authors

  • Eneko AGIRRE
  • Olatz ANSA
  • Xabier ARREGI
  • Xabier ARTOLA
  • Arantza DÍAZ DE ILARRAZA
  • Mikel LERSUNDI
  • David MARTÍNEZ
  • Kepa SARASOLA
  • Ruben URIZAR
Abstract

This paper deals with the exploitation of dictionaries for the semi-automatic construction of lexicons and lexical knowledge bases. The final goal of our research is to enrich the Basque Lexical Database with semantic information such as senses, definitions, semantic relations, etc., extracted from a Basque monolingual dictionary. The work here presented focuses on the extraction of the semantic relations that best characterise the headword, that is, those of synonymy, antonymy, hypernymy, and other relations marked by specific relators and derivation. All nominal, verbal and adjectival entries were treated. Basque uses morphological inflection to mark case, and therefore semantic relations have to be inferred from suffixes rather than from prepositions. Our approach combines a morphological analyser and surface syntax parsing (based on Constraint Grammar), and has proven very successful for highly inflected languages such as Basque. Both the effort to write the rules and the actual processing time of the dictionary have been very low. At present we have extracted 42,533 relations, leaving only 2,943 (9%) definitions without any extracted relation. The error rate is extremely low, as only 2.2% of the extracted relations are wrong.
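
The figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope reading, not something the authors report: it assumes the 9% refers to the share of all treated definitions and that both percentages are rounded, so the derived totals are approximate. The function and key names are illustrative.

```python
# Figures quoted in the abstract.
EXTRACTED_RELATIONS = 42_533
DEFINITIONS_WITHOUT_RELATION = 2_943
SHARE_WITHOUT_RELATION = 0.09  # "9%", assumed to be a rounded share of all definitions
ERROR_RATE = 0.022             # "2.2%", assumed to be rounded


def implied_totals():
    """Derive approximate totals implied by the quoted figures."""
    # If 2,943 definitions are 9% of the total, the total is 2,943 / 0.09.
    total_definitions = round(DEFINITIONS_WITHOUT_RELATION / SHARE_WITHOUT_RELATION)
    # 2.2% of the extracted relations are reported as wrong.
    wrong_relations = round(EXTRACTED_RELATIONS * ERROR_RATE)
    return {
        "total_definitions": total_definitions,
        "definitions_with_relation": total_definitions - DEFINITIONS_WITHOUT_RELATION,
        "wrong_relations": wrong_relations,
        "correct_relations": EXTRACTED_RELATIONS - wrong_relations,
    }


if __name__ == "__main__":
    for name, value in implied_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the dictionary would contain on the order of 32,700 treated definitions, and fewer than a thousand of the extracted relations would be wrong.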

Similar resources

Measuring the Semantic Distance between Languages from a Statistical Analysis of Bilingual Dictionaries

A bilingual dictionary is a valuable linguistic resource which records, among other things, the differences in the segmentation of semantic space by the two languages and hence the difficulty in producing faithful translations between the two languages. Statistical analysis of nearly a hundred dictionaries has allowed us to determine how best to measure the semantic distance between languages from...

Cross-language Semantic Relations between English and Portuguese / Relaciones Semánticas entre los Idiomas Inglés y Portugués

This paper describes conceptual semantic relations obtained from OpenLogos resources converted into NooJ format. These relations were symbolically represented in the OpenLogos lexicon as a taxonomic scheme called semantico-syntactic abstraction language (SAL), used to generate hierarchical hyponymy and hypernymy relations. The paper also describes action-of, result-of, and synonymy relations be...

Constructing an intelligent dictionary help system

This paper shows different issues in the construction and knowledge representation of an intelligent dictionary help system. IDHS (Intelligent Dictionary Help System) is conceived as a monolingual (explanatory) dictionary system for human use (Artola & Evrard, 92). The fact that it is intended for people instead of automatic processing distinguishes it from other systems dealing with the acquis...

Monolingual and bilingual dictionary approaches to the enrichment of the Spanish WordNet with adjectives

We report on two different approaches to the incorporation of adjectives in Spanish WordNet based on automatic extraction techniques using EuroWordNet and machine-readable dictionaries. We show that a monolingual dictionary approach enables to exploit relations between different parts of speech and enrich the internal structure of the Spanish WordNet, while the methods based on bilingual dictio...

Publication date: 2000